On How AI Needs to Change to Advance the Science of Drug Discovery
Research around AI for Science has seen significant success since the rise of deep learning models over the past decade, even on longstanding challenges such as protein structure prediction. However, this fast development has inevitably made the flaws of these models apparent, especially in domains of reasoning where understanding cause-effect relationships is important. One such domain is drug discovery, in which such understanding is required to make sense of data otherwise plagued by spurious correlations. This spuriousness only worsens with the ongoing trend of ever-increasing amounts of data in the life sciences, and it thereby restricts researchers in their ability to understand disease biology and create better therapeutics. Therefore, to advance the science of drug discovery with AI, it is becoming necessary to formulate the key problems in the language of causality, which allows the explication of the modelling assumptions needed for identifying true cause-effect relationships. In this attention paper, we present causal drug discovery as the craft of creating models that ground the process of drug discovery in causal reasoning.
Comment: Main paper: 6 pages, References: 1.5 pages. Main paper: 3 figures.
On the Tractability of Neural Causal Inference
Roth (1996) proved that any form of marginal inference with probabilistic graphical models (e.g. Bayesian networks) is at least NP-hard. Introduced and extensively investigated over the past decade, the neural probabilistic circuits known as sum-product networks (SPNs) offer marginal inference in linear time. Meanwhile, research around neural causal models (NCMs) has recently gained traction, demanding a tighter integration of causality into machine learning. To this end, we present a theoretical investigation of whether, when, how, and at what cost tractability occurs for different NCMs. We prove that SPN-based causal inference is generally tractable, as opposed to standard MLP-based NCMs. We further introduce a new tractable NCM class that is efficient in inference and fully expressive in terms of Pearl's Causal Hierarchy. Our comparative empirical illustration on simulations and standard benchmarks validates our theoretical proofs.
Comment: Main paper: 8 pages, References: 2 pages, Appendix: 5 pages. Figures: 5 main, 2 appendix.
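The tractability claim rests on SPN structure: marginalizing a variable only requires setting its leaf to 1, so any marginal query is a single bottom-up pass. Below is a minimal sketch of that mechanism on a toy mixture of two product nodes over two binary variables; the structure and weights are illustrative inventions, not the NCM architecture from the paper.

```python
# Toy sum-product network: a weighted sum node over two product nodes,
# each a product of independent Bernoulli leaves. Illustrative only.

# Leaf distributions of the two mixture components.
p_x1_a, p_x2_a = {0: 0.3, 1: 0.7}, {0: 0.6, 1: 0.4}
p_x1_b, p_x2_b = {0: 0.8, 1: 0.2}, {0: 0.1, 1: 0.9}

def leaf(table, value):
    """Evaluate a leaf; value=None marginalizes the variable (leaf sums to 1)."""
    return 1.0 if value is None else table[value]

def spn(x1=None, x2=None):
    """One bottom-up pass, linear in the number of nodes, for any query."""
    prod_a = leaf(p_x1_a, x1) * leaf(p_x2_a, x2)  # product node A
    prod_b = leaf(p_x1_b, x1) * leaf(p_x2_b, x2)  # product node B
    return 0.5 * prod_a + 0.5 * prod_b            # weighted sum node (root)

print(spn(x1=1))        # marginal P(X1=1) = 0.45, no explicit sum over X2
print(spn(x1=1, x2=0))  # joint P(X1=1, X2=0) = 0.22
```

The same marginal query against an unrestricted joint model would require summing over all states of the marginalized variables, which is where the NP-hardness result of Roth (1996) bites.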
Causal Parrots: Large Language Models May Talk Causality But Are Not Causal
Some argue that scale is all that is needed to achieve AI, covering even causal models. We make it clear that large language models (LLMs) cannot be causal and give reasons why it might sometimes feel otherwise. To this end, we define and exemplify a new subgroup of Structural Causal Models (SCMs) that we call meta SCMs, which encode causal facts about other SCMs within their variables. We conjecture that in the cases where an LLM succeeds in doing causal inference, the underlying source was a respective meta SCM that exposed correlations between causal facts in the natural language data on which the LLM was ultimately trained. If our hypothesis holds true, this would imply that LLMs are like parrots in that they simply recite the causal knowledge embedded in the data. Our empirical analysis provides favoring evidence that current LLMs are even weak 'causal parrots.'
Comment: Published in Transactions on Machine Learning Research (TMLR) (08/2023). Main paper: 17 pages, References: 3 pages, Appendix: 7 pages. Figures: 5 main, 3 appendix. Tables: 3 main.
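The parrot intuition can be made concrete with a toy: a model that answers causal queries purely by pattern-matching causal facts stated in its training text looks causal exactly when, and only when, those facts were written down for it. The corpus, query format, and scoring below are illustrative inventions, not the paper's experimental setup.

```python
# A minimal "causal parrot": causal question answering by recitation alone.
corpus = [
    "smoking causes cancer",
    "altitude causes lower temperature",
    "rain causes wet grass",
]

def parrot_answer(cause, effect):
    """Answer 'does cause -> effect?' by matching the training text:
    no SCM, no interventions, just correlations between stated facts."""
    return any(f"{cause} causes" in s and effect in s for s in corpus)

print(parrot_answer("rain", "wet grass"))  # True: the fact appears verbatim
print(parrot_answer("wet grass", "rain"))  # False: the reversed fact never did
# The answers look causal only because causal facts were embedded in the data.
```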
Pearl Causal Hierarchy on Image Data: Intricacies & Challenges
Many researchers have voiced their support for Pearl's counterfactual theory of causation as a stepping stone toward AI/ML research's ultimate goal of intelligent systems. As in any other growing subfield, patience seems to be a virtue, since significant progress on integrating notions from both fields takes time. Yet major challenges, such as the lack of ground-truth benchmarks or of a unified perspective on classical problems such as computer vision, seem to hinder the momentum of this research movement. The present work exemplifies how the Pearl Causal Hierarchy (PCH) can be understood on image data, providing insights on several intricacies but also challenges that naturally arise when applying key concepts from Pearlian causality to the study of image data.
Comment: Main paper: 9 pages, References: 2 pages. Main paper: 7 figures.
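For readers unfamiliar with the PCH, its three rungs (association, intervention, counterfactual) are easiest to see on a scalar SCM before worrying about images. The sketch below uses the toy mechanism X := U_X, Y := 2X + U_Y; the structural equations, noise scales, and observed values are illustrative choices, not from the paper.

```python
# The three PCH rungs on a toy SCM: X := U_X, Y := 2*X + U_Y.
import numpy as np

rng = np.random.default_rng(0)

def sample(n, do_x=None):
    """Draw from the SCM; do_x replaces the mechanism for X (an intervention)."""
    u_x, u_y = rng.normal(size=n), rng.normal(size=n)
    x = np.full(n, do_x) if do_x is not None else u_x
    y = 2 * x + u_y
    return x, y

# Rung 1 (association): observational correlation between X and Y.
x, y = sample(100_000)
print("corr(X, Y):", np.corrcoef(x, y)[0, 1])

# Rung 2 (intervention): E[Y | do(X=1)] = 2 follows from the mechanism.
_, y_do = sample(100_000, do_x=1.0)
print("E[Y | do(X=1)]:", y_do.mean())

# Rung 3 (counterfactual): abduction-action-prediction for a single unit.
x_obs, y_obs = 0.5, 1.7
u_y = y_obs - 2 * x_obs                   # abduce this unit's noise: U_Y = 0.7
print("Y had X been 1:", 2 * 1.0 + u_y)   # counterfactual Y_{X=1}(u) = 2.7
```

On image data, an immediate complication is that such variables are not given upfront but are entangled in high-dimensional pixels, which is the kind of intricacy the work examines.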
Can Linear Programs Have Adversarial Examples? A Causal Perspective
Recent years have been marked by extended research on adversarial attacks, especially on deep neural networks. With this work we intend to pose and investigate the question of whether the phenomenon might be more general in nature, that is, whether adversarial-style attacks exist outside classification. Specifically, we investigate optimization problems, starting with linear programs (LPs). We start off by demonstrating the shortcoming of a naive mapping between the formalism of adversarial examples and LPs, to then reveal how we can provide the missing piece: intriguingly, through the Pearlian notion of causality. Characteristically, we show the direct influence of the Structural Causal Model (SCM) on the subsequent LP optimization, which ultimately exposes a notion of confounding in LPs (inherited from said SCM) that allows for adversarial-style attacks. We provide a formal general proof alongside existence proofs of such intriguing SCM-based LP parameterizations for three combinatorial problems, namely Linear Assignment, Shortest Path, and a real-world problem from energy systems.
Comment: Main paper: 9 pages, References: 2 pages, Supplement: 2 pages. Main paper: 2 figures, 3 tables, Supplement: 1 figure, 1 table.
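The core analogy is that a small, targeted change to the data parameterizing an LP can flip the returned optimum discontinuously, much like an imperceptible pixel perturbation flips a classifier's label. The sketch below shows this on a toy two-variable LP by nudging the cost vector directly; in the paper the perturbation enters through an SCM that confounds the LP's parameters, which is not reproduced here.

```python
# Adversarial-style flip of an LP optimum via a tiny cost perturbation.
from scipy.optimize import linprog

# Nominal LP: minimize c^T x subject to x1 + x2 <= 1, x >= 0.
A_ub, b_ub = [[1.0, 1.0]], [1.0]
bounds = [(0, None), (0, None)]

res = linprog([-1.0, -0.9], A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print("nominal argmin:", res.x)        # all mass on x1: [1, 0]

# A perturbation of size 0.1 in each cost entry moves the solution to a
# completely different vertex of the feasible polytope.
res_adv = linprog([-0.9, -1.0], A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print("perturbed argmin:", res_adv.x)  # all mass on x2: [0, 1]
```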
Tearing Apart NOTEARS: Controlling the Graph Prediction via Variance Manipulation
Simulations are ubiquitous in machine learning. Especially in graph learning, simulations of Directed Acyclic Graphs (DAGs) are deployed for evaluating new algorithms. In the literature, it was recently argued that continuous-optimization approaches to structure discovery, such as NOTEARS, might be exploiting the sortability of the variables' variances in the available data due to their use of least-squares losses. Since structure discovery is a key problem in science and beyond, we want to be invariant to the scale used for measuring our data (e.g. meters versus centimeters should not affect the causal direction inferred by the algorithm). In this work, we further strengthen this initial, negative empirical suggestion by both proving key results in the multivariate case and corroborating them with further empirical evidence. In particular, we show that we can control the resulting graph with our targeted variance attacks, even in the case where we can only partially manipulate the variances of the data.
Comment: Main paper: 5.5 pages, References: 1 page, Supplement: 2 pages. Main paper: 3 figures, Supplement: 1 figure, 1 table.
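Why variance sortability matters is easy to demonstrate: in a linear additive-noise model, effects accumulate the noise of their causes and so tend to have larger marginal variance, which a least-squares objective implicitly rewards. The sketch below uses an explicit "orient toward the higher variance" rule as a simplified stand-in for that implicit bias; it is not the NOTEARS objective itself.

```python
# Rescaling a variable (changing its measurement unit) flips a
# variance-based orientation rule, illustrating the attack surface.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x + rng.normal(size=100_000)   # ground truth: X -> Y, var(Y) = 2

def orient(a, b):
    """Toy rule: point the edge toward the higher-variance variable."""
    return "X -> Y" if np.var(a) < np.var(b) else "Y -> X"

print(orient(x, y))        # "X -> Y": correct, since var(X)=1 < var(Y)=2
print(orient(100 * x, y))  # "Y -> X": measuring X in other units flips it
```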
Machines Explaining Linear Programs
There has been a recent push toward making machine learning models more interpretable so that their performance can be trusted. Although successful, these methods have mostly focused on deep learning, while fundamental optimization methods in machine learning, such as linear programs (LPs), have been left out. Even if LPs can be considered whitebox or clearbox models, they are not easy to understand in terms of the relationships between inputs and outputs. As a linear program only provides the optimal solution to an optimization problem, further explanations are often helpful. In this work, we extend the attribution methods for explaining neural networks to linear programs. These methods explain the model by providing relevance scores for the model inputs, to show the influence of each input on the output. Alongside using classical gradient-based attribution methods, we also propose a way to adapt perturbation-based attribution methods to LPs. Our evaluations on several different linear and integer problems show that attribution methods can generate useful explanations for linear programs. However, we also demonstrate that using a neural attribution method directly might come with some drawbacks, as the properties of these methods on neural networks do not necessarily transfer to linear programs. The methods can also struggle if a linear program has more than one optimal solution, as a solver just returns one possible solution. We hope our results can serve as a good starting point for further research in this direction.
Comment: Main paper: 9.5 pages, References: 2.5 pages, Supplement: 6 pages. Main paper: 5 figures, 4 tables, Supplement: 3 figures, 6 tables.
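A perturbation-based attribution for an LP can be as simple as nudging one input at a time and recording how the optimal value moves. The sketch below scores the constraint bounds of a toy LP this way; treating relevance as the finite-difference sensitivity of the optimum is our simplifying assumption, not necessarily the adaptation proposed in the paper (for LPs this sensitivity coincides with the dual variable of the perturbed constraint).

```python
# Perturbation-based relevance scores for the constraint bounds of an LP.
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -2.0])                  # maximize x1 + 2*x2
A_ub = np.array([[1.0, 1.0], [0.0, 1.0]])   # x1 + x2 <= 4,  x2 <= 3
b_ub = np.array([4.0, 3.0])
bounds = [(0, None), (0, None)]

def opt_value(b):
    """Optimal objective value as a function of the constraint bounds."""
    return linprog(c, A_ub=A_ub, b_ub=b, bounds=bounds).fun

eps = 1e-4
base = opt_value(b_ub)
for i in range(len(b_ub)):
    b_pert = b_ub.copy()
    b_pert[i] += eps
    # How much does relaxing bound i by eps change the optimum?
    print(f"relevance of b[{i}]:", (opt_value(b_pert) - base) / eps)
```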
Towards a Solution to Bongard Problems: A Causal Approach
To date, Bongard Problems (BPs) remain one of the few fortresses of AI history yet to be raided by the powerful models of the current era. We present a systematic analysis using modern techniques from the intersection of causality and AI/ML in a humble effort to revive research around BPs. Specifically, we first compile the BPs into a Markov decision process, then pose causal assumptions on the data-generating process, arguing for their applicability to BPs, and finally apply reinforcement learning techniques to solve the BPs subject to the causal assumptions.
Comment: Main paper: 5.5 pages, References: 1 page, Supplement: 1 page. Main paper: 5 figures, Supplement: 3 figures.
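At its simplest, the compile-then-learn recipe amounts to treating each puzzle panel as a state, each candidate concept as an action, and correctness as reward. The tabular sketch below is a deliberately tiny stand-in for that framing (effectively a contextual bandit, i.e. a one-step MDP); the actual BP compilation and the causal assumptions from the paper are not reproduced.

```python
# Tiny RL stand-in: states are puzzle panels, actions are concept choices.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 2, 2        # two panel types, two candidate concepts
Q = np.zeros((n_states, n_actions))
alpha, eps = 0.1, 0.1             # learning rate, exploration rate

for _ in range(2_000):
    s = rng.integers(n_states)                       # observe a panel
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    r = 1.0 if a == s else 0.0                       # reward the correct concept
    Q[s, a] += alpha * (r - Q[s, a])                 # one-step value update

print("learned policy:", Q.argmax(axis=1))           # maps panel -> concept
```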